Unsupervised Reinforcement Learning in Multiple Environments

Authors

Abstract

Several recent works have been dedicated to unsupervised reinforcement learning in a single environment, in which a policy is first pre-trained with unsupervised interactions, and then fine-tuned towards the optimal policy for several downstream supervised tasks defined over the same environment. Along this line, we address the problem of unsupervised reinforcement learning in a class of multiple environments, in which the policy is pre-trained with interactions from the whole class, and then fine-tuned for several tasks in any environment of the class. Notably, the problem is inherently multi-objective, as we can trade off the pre-training objective between environments in many ways. In this work, we foster an exploration strategy that is sensitive to the most adverse cases within the class. Hence, we cast the exploration problem as the maximization of the mean of a critical percentile of the state visitation entropy induced by the exploration strategy over the class of environments. Then, we present a policy gradient algorithm, alphaMEPOL, to optimize the introduced objective through mediated interactions with the class. Finally, we empirically demonstrate the ability of the algorithm in learning to explore challenging classes of continuous environments, and we show that reinforcement learning greatly benefits from the pre-trained exploration strategy w.r.t. learning from scratch.
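The objective described above, the mean of a critical percentile of per-environment entropies, can be illustrated with a short sketch. This is a minimal illustration, not the paper's implementation: it assumes the per-environment state-visitation entropies have already been estimated, and the function name `percentile_objective` and the choice of percentile `alpha` are hypothetical.

```python
import numpy as np

def percentile_objective(entropies, alpha=0.2):
    """Mean of the lowest alpha-fraction (a CVaR-style critical percentile)
    of per-environment state-visitation entropies.

    entropies: iterable of entropy estimates, one per environment.
    alpha: fraction of worst-case environments to average over.
    """
    entropies = np.sort(np.asarray(entropies, dtype=float))  # ascending
    # Number of worst-case environments in the critical percentile.
    k = max(1, int(np.ceil(alpha * len(entropies))))
    return entropies[:k].mean()
```

Averaging only over the lowest-entropy environments makes the objective sensitive to the most adverse cases in the class, rather than rewarding a strategy that explores well on average but poorly in some environments.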


Similar articles

Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions

The application of Reinforcement Learning (RL) algorithms to learn tasks for robots is often limited by the large dimension of the state space, which may make a tabular model prohibitive. In this paper, we introduce LEAP (Learning Entities Adaptive Partitioning), a model-free learning algorithm that uses overlapping partitions which are dynamically modified to learn near-opti...


Unsupervised and reinforcement learning in neural networks

2.3. Initialize both Q-values at 2 (optimistic). Assume that, as in the first part, in the first round you get the reward for both actions. Update your Q-values once with η = 0.2. Suppose now that in the following rounds, you choose actions a1 and a2 alternatingly and update the Q-values with a very small learning rate (η = 0.001). How many rounds does it take on average, until the maximal Q...
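The update referred to in this exercise is the standard tabular Q-learning rule for a stateless bandit. A minimal sketch, assuming an immediate-reward setting (the reward value from "the first part" is not given here, so the value used below is purely illustrative):

```python
def q_update(q, reward, eta):
    """Tabular Q-update for a stateless bandit: Q <- Q + eta * (r - Q).

    q: current Q-value estimate for the chosen action.
    reward: observed immediate reward.
    eta: learning rate.
    """
    return q + eta * (reward - q)

# Illustration with an optimistic initialization Q = 2 and a
# hypothetical reward of 0, using eta = 0.2 as in the exercise:
q = q_update(2.0, 0.0, 0.2)  # Q moves a fraction eta toward the reward
```

With a very small learning rate such as η = 0.001, each update moves Q only 0.1% of the way toward the observed reward, which is why the exercise asks how many rounds convergence takes on average.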


Reinforcement Learning with Unsupervised Auxiliary Tasks

Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward. However, environments contain a much wider variety of possible training signals. In this paper, we introduce an agent that also maximises many other pseudo-reward functions simultaneously by reinforcement learning. All of these tasks share a common representation that, like unsupe...


Unsupervised Basis Function Adaptation for Reinforcement Learning

When using reinforcement learning (RL) algorithms to evaluate a policy it is common, given a large state space, to introduce some form of approximation architecture for the value function (VF). The exact form of this architecture can have a significant effect on the accuracy of the VF estimate, however, and determining a suitable approximation architecture can often be a highly complex task. Co...


Unsupervised Multiple Kernel Learning

Traditional multiple kernel learning (MKL) algorithms are essentially supervised learning in the sense that the kernel learning task requires the class labels of training data. However, class labels may not always be available prior to the kernel learning task in some real world scenarios, e.g., an early preprocessing step of a classification task or an unsupervised learning task such as dimens...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i7.20754